26 research outputs found

    Network Model Selection for Task-Focused Attributed Network Inference

    Networks are models representing relationships between entities. Often these relationships are explicitly given; otherwise, we must learn a representation that generalizes and predicts observed behavior in underlying individual data (e.g. attributes or labels). Whether given or inferred, the choice of representation affects subsequent tasks and questions on the network. This work addresses model selection for evaluating network representations inferred from data, with an emphasis on fundamental predictive tasks on networks. We present a modular methodology using general, interpretable network models, task neighborhood functions found across domains, and several criteria for robust model selection. We demonstrate our methodology on three online user activity datasets and show that network model selection for the appropriate network task vs. an alternate task increases performance by an order of magnitude in our experiments.

    Honesty is the Best Policy: On the Accuracy of Apple Privacy Labels Compared to Apps' Privacy Policies

    Apple introduced privacy labels in Dec. 2020 as a way for developers to report the privacy behaviors of their apps. While Apple does not validate labels, it does require developers to provide a privacy policy, which offers an important comparison point. In this paper, we applied the NLP framework of Polisis to extract features of the privacy policy for 515,920 apps on the iOS App Store, comparing the output to the privacy labels. We identify discrepancies between the policies and the labels, particularly as they relate to data collected that is linked to users. We find that 287±196K apps' privacy policies may indicate more data collection that is linked to users than what is reported in the privacy labels. More alarming, a large share (97±30%) of the apps that have a Data Not Collected privacy label have a privacy policy that indicates otherwise. We provide insights into potential sources for discrepancies, including the use of templates and confusion around Apple's definitions and requirements. These results suggest that there is still significant work to be done to help developers label their apps more accurately. Incorporating a Polisis-like system as a first-order check can help improve the current state and better inform developers when there are possible misapplications of privacy labels.

    SoK: A Data-driven View on Methods to Detect Reflective Amplification DDoS Attacks Using Honeypots

    In this paper, we revisit the use of honeypots for detecting reflective amplification attacks. These measurement tools require careful design of both data collection and data analysis, including cautious threshold inference. We survey common amplification honeypot platforms as well as the underlying methods to infer attack detection thresholds and to extract knowledge from the data. By systematically exploring the threshold space, we find that most honeypot platforms produce comparable results despite their different configurations. Moreover, by applying data from a large-scale honeypot deployment, network telescopes, and a real-world baseline obtained from a leading DDoS mitigation provider, we question the fundamental assumption of honeypot research that convergence of observations can imply their completeness. In conclusion, we derive guidance on precise, reproducible honeypot research and present open challenges. Comment: camera-read

    Hiding in Plain Sight: A Longitudinal Study of Combosquatting Abuse

    Domain squatting is a common adversarial practice where attackers register domain names that are purposefully similar to popular domains. In this work, we study a specific type of domain squatting called "combosquatting," in which attackers register domains that combine a popular trademark with one or more phrases (e.g., betterfacebook[.]com, youtube-live[.]com). We perform the first large-scale, empirical study of combosquatting by analyzing more than 468 billion DNS records, collected from passive and active DNS data sources over almost six years. We find that almost 60% of abusive combosquatting domains live for more than 1,000 days, and even worse, we observe increased activity associated with combosquatting year over year. Moreover, we show that combosquatting is used to perform a spectrum of different types of abuse including phishing, social engineering, affiliate abuse, trademark abuse, and even advanced persistent threats. Our results suggest that combosquatting is a real problem that requires increased scrutiny by the security community. Comment: ACM CCS 1

    One Thing Leads to Another: Credential Based Privilege Escalation

    A user's primary email account, in addition to being an easy point of contact in our online world, increasingly acts as a single point of failure for all web security. Features like unlimited message storage, numerous weak password reset mechanisms, and economically enticing spoils (in the form of financial accounts or personal photos) all add up to an environment where overthrowing someone's life via their primary email account is increasingly likely and damaging. We describe an attack we call credential based privilege escalation, and a methodology to evaluate this attack's potential for user harm at web scale. In a study of over 9,000 users we find that, unsurprisingly, access to a vast number of online accounts can be gained by breaking into a user's primary email account (even without knowing the email account's password), but even then the monetizable value in a typical account is relatively low. We also describe future directions in understanding both the technical and human aspects of credential based privilege escalation.

    Cloudsweeper: Enabling Data-Centric Document Management for Secure Cloud Archives

    Cloud based storage accounts like web email are compromised on a daily basis. At the same time, billions of Internet users store private information in these accounts. As the Internet matures and these accounts accrue more information, they become a single point of failure for both users' online identities and large amounts of their private information. This paper presents two contributions. The first, the heterogeneous documents abstraction, is a data-centric strategy for protecting high value information stored in globally accessible storage. The second, Cloudsweeper, is an implementation of the heterogeneous documents strategy as a cloud-based email protection system. Cloudsweeper gives users the opportunity to remove or “lock up” sensitive, unexpected, and rarely used information to mitigate the risks of cloud storage accounts without sacrificing the benefits of cloud storage or computation. We show that Cloudsweeper can efficiently assist users in pinpointing and protecting passwords emailed to them in cleartext. We present performance measurements showing that the system can rewrite past emails stored at cloud providers quickly, along with initial results regarding user preferences for redacted cloud storage.